Abstract

Using the uniformly most powerful unbiased test, we examined the daily sales distribution of consumer
electronics in Japan and report that it follows either a lognormal distribution or a power-law
distribution, depending on the state of the market. We show that switches between the two occur quite
often. The underlying sales dynamics in both periods closely match a multiplicative process. However,
whereas the multiplicative term in the process is size dependent when a steady lognormal distribution
holds, it is size independent when the power-law distribution holds. This difference in the underlying
dynamics is responsible for the difference between the two observed distributions.

I. INTRODUCTION



Since Pareto pointed out in 1896 that the distribution of income exhibits a heavy-tailed
structure [1], many papers have argued that such distributions can be found in a wide range
of empirical data that describe not only economic phenomena but also biological, physical,
ecological, sociological, and various man-made phenomena [2]. The list of quantities whose
distributions have been conjectured to obey such distributions includes
firm sizes [3], city populations [4], the frequency of unique words in a given novel [5-6],
biological genera by number of species [7], scientists by number of published papers [8], web
files transmitted over the internet [9], book sales [10], and product market shares [11]. Along
with these reports, the argument over the exact distribution, namely whether these heavy-tailed
distributions obey a lognormal distribution or a power-law distribution, has been repeated
over many years as well [2]. In this paper we use the statistical techniques developed in this
literature to clarify the sales distribution of consumer electronics.

To explain the emergence of heavy-tailed distributions, random growth processes are
widely used as an approximation of the underlying dynamics. Gibrat, who built upon
Kapteyn and Uven's work, was the first to propose the simplest form of this type of model,
usually known as the multiplicative process, to describe the appearance of heavy-tailed
distributions in firm size distributions [12]. His work is significant in the market structure
literature [13]. Even 70 years after Gibrat's book, more and more quantities are
found that are conjectured to obey this type of process.

Among recent works, Fu et al. [14] is crucial because it was perhaps the first work to 
consider the hierarchical structure of institutions. No one denies that firms grow in size and
scope and that such growth is heavily influenced by the successful launch of new products.
Fu et al. modeled products as elementary sales units assuming that they evolve based on a 
random multiplicative process. They extended the usual model of proportional growth to 
illustrate the size variance relationship found in growth distributions at different levels of 
aggregation in the economy by considering hierarchical structure. 

Many studies have analyzed product sales. Sornette et al. [10] used a book sales database
from Amazon.com and performed a time series analysis of book sales by classifying endogenous
and exogenous shocks. With a database of newspaper and magazine circulation, Picoli
et al. [15] used the multiplicative process to illustrate the link between tent-shaped log
growth distributions and the power-law distributions found in the growth of newspaper and
magazine sales. However, the exact dynamics of product sales remains an open question.

In this paper we clarify the distribution and the dynamics of the sales of consumer electronics
using a unique database of product sales from the Japanese consumer electronics
market. The data were recorded daily, making it possible to track the actual sales volume of
each product in detail and to model the dynamics with a more empirically
driven approach. We numerically analyzed more than 1200 sales distributions recorded on a
daily basis from October 1 2004 to February 29 2008. Using the uniformly most powerful unbiased
test, we statistically show that sales distributions differ among periods and occasionally
exhibit power-law behavior. We also show with the multiplicative process that the underlying
ingredients of stochastic growth themselves are different among these periods. Moreover, our
findings are compatible with the mathematical results reported by Ishikawa et al. [16].

The paper is organized as follows. Section 2 provides an overview of our data set. Section 3
introduces the sales distribution of consumer electronics, and Section 4 illustrates our
statistical technique for verifying a true power-law distribution. Using this
statistical technique, in Section 5, we show why the power-law behavior found in Section 3
can be considered genuine power-law behavior. Section 6 reports how the sales distribution
changes over time: it exhibits both power-law and lognormal distributions.
Section 7 focuses on the underlying dynamics of sales, providing another source of evidence
that the dynamics of sales differ among periods. Section 8 provides further discussion
and a conclusion.



II. SALES DATA OF CONSUMER ELECTRONICS 

Consumer electronics chains sell products such as TVs, personal computers, audio devices, 
refrigerators, digital cameras, air conditioners, and DVD recorders. Their annual revenue 
amounts to 5.9 trillion yen in Japan. In this paper we investigate the sales distribution using the
sales data of digital cameras from 23 different consumer electronics chains in Japan collected by a
private company called BCN Inc. This dataset covers about 45% of all consumer electronics
chains in Japan including over 1,400 retail stores [17]. The data were recorded daily covering
the period from October 1 2004 to April 30 2008.

III. SALES DISTRIBUTION OF CONSUMER ELECTRONICS



We focused on the top selling products using the cumulative distribution $P_>(S)$ defined as

$$P_>(S) = \int_S^{\infty} f(x)\,dx \qquad (1)$$

where $f(x)$ describes the probability density function. The cumulative distribution of the
sales volume of digital cameras on April 1 2005 is shown on a double logarithmic scale
(Fig. 1). It exhibits a heavy-tailed structure. To investigate the exact characteristics of
its distribution, we also depict the maximum likelihood estimate of a lognormal distribution,
assuming that all values above 1 obey a lognormal distribution, and the maximum
likelihood estimate of a power-law distribution, assuming that all values above 16 obey
a power-law distribution. A lognormal distribution fits nicely for almost all points except
the last three. For the points above 16, including these last three points, at first glance it
seems that a power-law distribution fits better. In this paper we numerically judge whether
a simple lognormal distribution or a lognormal distribution with a power-law tail gives
a better fit using the statistical technique developed by Malevergne et al. [18]. Distinguishing
between these two distributions is important because the tail describes the top selling
products, and these products, which seem to lie in the power-law region, account for about
80% of total sales; identifying the dynamics of these top selling products is therefore
important.

IV. TESTING A POWER-LAW DISTRIBUTION AT THE TAIL 

To judge whether a power-law distribution or a lognormal distribution gives a better
fit for values over a threshold, one natural way is to use a model selection technique between
a power-law distribution and a singly truncated lognormal distribution whose truncation
point is identical to the lower bound of the power-law distribution (for instance, see
Clauset et al. [19]). A basic change of variables shows that the logarithm of a random
variable that obeys a power-law distribution follows an exponential distribution, whereas the
logarithm of a singly truncated lognormal variable follows a singly truncated normal distribution.
Hence the test of a power law against a singly truncated lognormal is equivalent to testing
an exponential distribution against a singly truncated normal distribution in the log-size
distribution of the original quantity.
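
The change of variables can be checked numerically. The following is a minimal sketch, assuming a pure power-law tail with lower bound `x_min` and exponent `mu` (illustrative values chosen here, not estimates taken from the sales data). It draws power-law samples, takes logarithms measured from the truncation point, and compares the result with two properties of an exponential distribution: mean equal to 1/mu and coefficient of variation equal to 1.

```python
import math
import random


def sample_power_law(n, x_min, mu, rng):
    """Draw n samples with P(X > x) = (x / x_min) ** (-mu) by inverse transform."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / mu) for _ in range(n)]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(var) / mean


rng = random.Random(0)
x_min, mu = 16.0, 1.3  # illustrative values only
sales = sample_power_law(20000, x_min, mu, rng)

# Log-sizes measured from the truncation point A = log(x_min).
excess = [math.log(s) - math.log(x_min) for s in sales]

# For an exponential distribution with rate mu: mean = 1 / mu and CV = 1.
mean_excess = sum(excess) / len(excess)
cv_excess = coefficient_of_variation(excess)
print(round(mean_excess * mu, 2), round(cv_excess, 2))
```

Both printed numbers should be close to 1 for a power-law sample; a coefficient of variation clearly below 1 would instead point toward the truncated normal alternative in log-size.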




Next, as shown by Castillo [20], an exponential distribution and a singly truncated normal
distribution have the following relationship:



$$f_{STN}(x; \alpha, \beta, A) \to f_{\exp}(x; \lambda)\,\mathbf{1}_{x \ge A} \quad \text{as } (\alpha, \beta) \to (\lambda, 0), \qquad (2)$$

where $A$ denotes the truncation point and $\alpha$ and $\beta$ are the parameters of a singly truncated
normal distribution with the following relationship:

$$\alpha := -\frac{\mu}{\sigma^2}, \qquad \beta := \frac{1}{2\sigma^2} \qquad (3)$$

where $\mu$ is the usual mean and $\sigma$ is the standard deviation. This implies that the exponential
distribution lies on the boundary of the singly truncated normal family. This relationship
illustrates why a singly truncated normal distribution (singly truncated lognormal
distribution) so closely resembles an exponential distribution (power-law distribution) as $\beta$
becomes increasingly close to 0. Fig. 2 shows the maximum likelihood estimates assuming an
exponential distribution and a singly truncated normal distribution for the log-size distribution
of digital camera sales on April 1 2005, setting the truncation point to $A = \log(16)$.
Observe from the maximum likelihood estimates that a singly truncated normal distribution
with sufficiently small $\beta$ closely resembles an exponential distribution.

Considering this relationship, a natural way to distinguish an exponential distribution
from a singly truncated normal alternative is to test the exponential form
(null hypothesis) against the singly truncated normal alternative (alternative hypothesis)
using the likelihood ratio test, which evaluates the statistic

$$W = 2\left(L(\hat{\theta}) - L(\hat{\theta}_0)\right) \qquad (4)$$

where $L$ denotes the log-likelihood function, $\hat{\theta} = (\hat{\alpha}, \hat{\beta})$ is the maximum likelihood
estimate under the full model, and $\hat{\theta}_0 = (\hat{\lambda}, 0) = (1/\bar{x}, 0)$ is the estimate under the
exponential form. Castillo and Puig [21] showed the following: 1) the likelihood ratio test is the
uniformly most powerful unbiased test in this case; 2) the likelihood ratio test can easily be
performed using the clipped coefficient of variation (i.e. $\bar{c} = \min\{1, c\}$, where $c$ is the
empirical coefficient of variation); and 3) the critical region of the test can be approximated
with a high degree of accuracy even for small samples using the saddle point approximation.
Malevergne et al. [18] applied this test to US city size data to determine whether a lognormal
distribution suffices or a power-law distribution gives a better fit for the upper tail, and
concluded that the upper tail of the size distribution of US cities is in fact a power law.
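
The role of the clipped coefficient of variation can be illustrated with a short sketch. It is not the saddle point approximation of Castillo and Puig [21]: as a stand-in, the null distribution of the statistic is simulated by Monte Carlo, which is possible because the coefficient of variation of an exponential sample does not depend on its rate. The function names, the use of the population standard deviation, and the sample sizes are assumptions made for this illustration.

```python
import math
import random


def clipped_cv(excess):
    """Clipped coefficient of variation min(1, c) of values measured from A."""
    n = len(excess)
    mean = sum(excess) / n
    var = sum((v - mean) ** 2 for v in excess) / n
    return min(1.0, math.sqrt(var) / mean)


def exponential_p_value(log_sizes, a, n_sim=1000, seed=0):
    """Monte Carlo p-value for H0: exponential log-size tail above a.

    Small values of the clipped CV point toward the truncated normal
    alternative, so the p-value is the share of simulated exponential
    samples whose clipped CV is at most the observed one.
    """
    excess = [x - a for x in log_sizes if x > a]
    observed = clipped_cv(excess)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.expovariate(1.0) for _ in range(len(excess))]
        if clipped_cv(simulated) <= observed:
            hits += 1
    return hits / n_sim


rng = random.Random(1)
a = math.log(16.0)  # illustrative truncation point

# Truncated normal (half-normal) excesses: the exponential null should be rejected.
half_normal = [a + abs(rng.gauss(0.0, 1.0)) for _ in range(400)]
p_truncated_normal = exponential_p_value(half_normal, a)

# Excesses with CV above 1 are clipped to 1, so the null is never rejected.
heavy = [a + 0.1] * 9 + [a + 10.0]
p_heavy = exponential_p_value(heavy, a, n_sim=200)
print(p_truncated_normal, p_heavy)
```

The clipping makes the test one-sided: only samples that are less dispersed than an exponential count as evidence for the truncated normal alternative.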

Figure 3 shows the test applied to the sales distribution on April 1 2005. Starting from
the top 10 products, we recursively calculated the p-value of the test using Castillo and Puig's
method. We then determined the point where the p-value of the test first falls within the
critical region (in this paper, $\alpha = 0.1$) and subtracted 1. For the sales distribution on April 1
2005 this point is 68. This implies that for the 68 points above this threshold the power-law
distribution is not rejected, and it shows that the upper tail of the sales distribution on
April 1 2005 was well fit by a power-law distribution.
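
The scan over rank thresholds can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used to produce Fig. 3: the p-value is obtained by Monte Carlo instead of the saddle point approximation, the truncation point for the top k products is taken to be the (k+1)-th largest value, and all function names are hypothetical.

```python
import math
import random


def clipped_cv(excess):
    """min(1, c), with c the coefficient of variation of the excesses."""
    n = len(excess)
    mean = sum(excess) / n
    if mean <= 0.0:
        return 1.0  # degenerate sample (all ties): treated as no evidence against H0
    var = sum((v - mean) ** 2 for v in excess) / n
    return min(1.0, math.sqrt(var) / mean)


def null_cv_table(n_max, n_sim, rng):
    """table[k] holds clipped CVs of n_sim simulated exponential samples of size k."""
    table = {k: [] for k in range(2, n_max + 1)}
    for _ in range(n_sim):
        s1 = s2 = 0.0
        for k in range(1, n_max + 1):
            v = rng.expovariate(1.0)
            s1 += v
            s2 += v * v
            if k >= 2:
                mean = s1 / k
                var = max(s2 / k - mean * mean, 0.0)
                table[k].append(min(1.0, math.sqrt(var) / mean))
    return table


def estimate_rank_threshold(sales, alpha=0.1, start=10, n_sim=500, seed=0):
    """Largest rank before the exponential (power-law) null is first rejected."""
    ranked = sorted(sales, reverse=True)
    n_max = len(ranked) - 1
    table = null_cv_table(n_max, n_sim, random.Random(seed))
    for k in range(start, n_max + 1):
        a = math.log(ranked[k])  # assumed truncation point for the top k products
        observed = clipped_cv([math.log(v) - a for v in ranked[:k]])
        p_value = sum(c <= observed for c in table[k]) / n_sim
        if p_value < alpha:
            return k - 1
    return n_max


rng = random.Random(0)
# Synthetic power-law sales with illustrative parameters (not the BCN data).
synthetic_sales = [16.0 * (1.0 - rng.random()) ** (-1.0 / 1.3) for _ in range(150)]
threshold = estimate_rank_threshold(synthetic_sales)
print(threshold)
```

Because the test is repeated at every rank, a rejection can occur by chance even for genuinely power-law data, which is one reason a single date is not treated as conclusive.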

V. DISTRIBUTION ANALYSIS OF SALES 

In a small sample data set, we often observe "power-law behavior" (a straight line in the
cumulative distribution depicted on a double logarithmic scale) even when the data are actually
sampled from a theoretical lognormal distribution. Fig. 4 shows two cases that plot the
cumulative distribution of synthetic data sets randomly sampled from a theoretical lognormal
distribution with the same parameters as the sales distribution of digital cameras on April
1 2005 (Fig. 1). In the case depicted in the left panel, note that the tail follows a
lognormal distribution. In the other case, however, even the statistical technique
explained in Section 4 returns an estimated lower bound of 77
for the distribution denoted in circles, apparently confirming power-law behavior at the tail.
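
The sample-to-sample variability of a lognormal tail can be reproduced with a short sketch. It uses the lognormal parameters and sample size quoted in the caption of Fig. 4 (mu = 1.97, sigma = 1.36, 250 points); the least-squares tail slope and the choice of the 30 largest points are assumptions introduced here only to summarize how straight each tail looks on a double logarithmic plot.

```python
import math
import random


def empirical_ccdf(values):
    """Return (x, P(X >= x)) pairs sorted by increasing x."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (n - i) / n) for i, x in enumerate(ordered)]


def tail_slope(ccdf, n_tail):
    """Least-squares slope of log P versus log x over the n_tail largest points."""
    pts = [(math.log(x), math.log(p)) for x, p in ccdf[-n_tail:]]
    mean_u = sum(u for u, _ in pts) / len(pts)
    mean_v = sum(v for _, v in pts) / len(pts)
    sxx = sum((u - mean_u) ** 2 for u, _ in pts)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    return sxy / sxx


rng = random.Random(0)
mu, sigma, n_points = 1.97, 1.36, 250  # values quoted in the Fig. 4 caption
slopes = []
for _ in range(5):
    sample = [rng.lognormvariate(mu, sigma) for _ in range(n_points)]
    slopes.append(tail_slope(empirical_ccdf(sample), n_tail=30))
print([round(s, 2) for s in slopes])
```

The printed slopes differ from one synthetic data set to the next, although every sample comes from the same lognormal distribution.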

Hence, to judge whether distributions found in a certain period are well described by a
power-law distribution, we must consider all the distributions found during that period. The
left panel of Fig. 5 shows the estimated lower bound for the 107 dates from January 22
2005 to May 8 2005, and the right panel shows the estimated lower bound of the first 107
synthetic data sets randomly sampled from a theoretical lognormal distribution with the
same parameters as the sales distribution found on April 1 2005. The lower bound from the
real data is clearly quite stable, which shows that the power-law behavior found in the sales
distribution of April 1 2005 reflected a generating process that produces genuine power-law
behavior and was not the result of a process that generates a lognormal distribution. Fig. 6
shows the corresponding probability densities, confirming that the behaviors found in Fig. 5 are
statistically quite different. Fig. 7 shows the time evolution of the power-law exponent from
January 22 2005 to May 8 2005. The power-law exponent is stable and fluctuates around
$\mu = 1.3 \pm 0.1$, which is quite close to the power-law exponent found for city size [18]
and wealth [22]. Periods in which the power-law behavior is stable were repeatedly
found, which shows that the lognormal distribution does not sufficiently describe the sales
distribution of digital cameras on a daily basis.

VI. HOW DISTRIBUTION CHANGES OVER TIME 

Next we focus on all the other dates in our data set. Fig. 8 shows the estimated rank
thresholds from October 1 2004 to February 29 2008. Periods in which we successively
observed high estimated rank thresholds are not limited to January 22 2005 to May 8 2005 but
are also found in other parts of the data. However, there is a period in which the estimated
rank threshold does not behave as if power-law behavior really exists in the distribution
tail: January 16 2006 to August 8 2006 (Fig. 8). Fig. 9 shows a typical cumulative
distribution observed during this period. The lognormal distribution adequately explains
the sales distribution for all points. Fig. 10 shows the histogram of the estimated
rank threshold for this period. For this period a simple lognormal distribution adequately
describes the distribution of sales. Therefore we conclude that the sales distribution depends
on the period in which it is observed.

VII. SALES DYNAMICS OF CONSUMER ELECTRONICS 

It is well known that the proportional growth principle applies to firm growth [23]. Fu et al.
showed that this principle applies not only to firms but also to different levels of aggregation
of the economy, from countries and industry sectors to products, and showed theoretically
that this stems from the elementary sales unit (i.e. sales of products) evolving according to
a random multiplicative process [14,24]. Sakai and Watanabe investigated this issue further,
confirming that the dynamics of products determines firm growth by analyzing sales of products
sold at grocery stores in Japan [25]. Picoli et al. used the multiplicative process to model the
dynamics of the circulation of newspapers and magazines [15]. Motivated by this literature, we
use

$$S(t+1) = |b(t)\,S(t) + \epsilon(t)|, \qquad \epsilon(t) \sim \mathrm{Gaussian}(0, \sigma) \qquad (5)$$

to describe the underlying dynamics of sales. This assumes a preferential-like model for
sales, which requires the age and average sales of products to correlate in an exponential fashion
if the distribution of lifetime is exponential. It is reported that the lifetime distribution
often follows an exponential function in competitive markets [26]. Although this relationship
cannot easily be verified directly with the products themselves, because the lifetime of a digital
camera is short (usually 6 to 12 months) due to product turnover, it can be roughly verified by
comparing the average daily market share of brands during their lifetime with their age
(Fig. 11).

If multiplicative noise b is independent of the former size of S, then S leads to a steady
power-law distribution [27]. However, if it is size dependent, S departs from a power-law
distribution [28]. In this section, we show that sales dynamics follow this multiplicative
process and use it to reexamine the differences in the sales distribution found in Section 6
from the usually assumed elements of a stochastic growth process.
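
The size-independent case can be explored with a small simulation. The sketch below iterates the process S(t+1) = |b(t) S(t) + e(t)| for many independent units; the lognormal form of b, its parameters, the noise level, and the use of a Hill-type maximum likelihood estimator for the tail exponent are all assumptions made for illustration and are not taken from the sales data.

```python
import math
import random


def simulate_sizes(n_products, n_steps, b_log_mean, b_log_sd, noise_sd, rng):
    """Iterate S(t+1) = |b(t) * S(t) + e(t)| with size-independent b(t)."""
    sizes = [1.0] * n_products
    for _ in range(n_steps):
        sizes = [
            abs(rng.lognormvariate(b_log_mean, b_log_sd) * s + rng.gauss(0.0, noise_sd))
            for s in sizes
        ]
    return sizes


def hill_exponent(values, n_tail):
    """Maximum likelihood (Hill) estimate of the tail exponent from the n_tail largest values."""
    ordered = sorted(values, reverse=True)
    x_min = ordered[n_tail]
    return n_tail / sum(math.log(x / x_min) for x in ordered[:n_tail])


rng = random.Random(0)
# Assumed parameters: log b ~ Normal(m, s^2) with m = -0.16 and s = 0.5.
# For such processes the tail exponent beta is commonly characterized by
# E[b ** beta] = 1, which for a lognormal b gives beta = -2 * m / s**2 = 1.28.
sizes = simulate_sizes(2000, 200, -0.16, 0.5, 0.1, rng)
exponent = hill_exponent(sizes, n_tail=100)
print(round(exponent, 2))
```

Replacing the single pair of parameters for b with values that depend on the current size is one way to experiment with the size-dependent case discussed in the text.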

When BCN Inc. collected this dataset, it made new contracts with additional stores to
provide sales data, which generated an apparent, artificial change in product sales over time.
To cope with this problem, we introduce normalized sales,
$S_i(t) / \left(\frac{1}{n}\sum_{j=1}^{n} S_j(t)\right)$, which we again denote by $S_i(t)$, instead
of actual sales to compare two distant periods. Here, $n$ is the number of products in the
market. All the results in this section can also be reproduced using market share,
$S_i(t) / \sum_{j=1}^{n} S_j(t)$, as well. To begin our empirical investigation we cut the scatter plots of
both periods into equal logarithmic bins: $0.7 < S_{low} < 2.175 < S_{mid} < 6.76 < S_{high} < 21$
(Fig. 12). The basic idea is to observe whether the distribution of sales growth over one week
depends on $S_i(t)$.
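
The normalization and binning step can be sketched as follows. The bin edges are the ones quoted in the text; the normalization by the cross-sectional mean, the function names, and the toy sales figures are assumptions made for this illustration.

```python
import math

BIN_EDGES = [0.7, 2.175, 6.76, 21.0]  # bin edges quoted in the text
BIN_NAMES = ["low", "mid", "high"]


def normalize_by_mean(sales):
    """Divide each product's sales by the average sales over all products."""
    mean = sum(sales) / len(sales)
    return [s / mean for s in sales]


def size_bin(s):
    """Name of the logarithmic bin containing s, or None if s lies outside all bins."""
    for name, lo, hi in zip(BIN_NAMES, BIN_EDGES, BIN_EDGES[1:]):
        if lo < s < hi:
            return name
    return None


def log_growth_by_bin(current, following):
    """Group log(following / current) by the bin of the current normalized sales."""
    groups = {name: [] for name in BIN_NAMES}
    for s_now, s_next in zip(current, following):
        name = size_bin(s_now)
        if name is not None and s_next > 0:
            groups[name].append(math.log(s_next / s_now))
    return groups


# The three bins have (nearly) equal width on a logarithmic scale.
log_widths = [math.log(hi / lo) for lo, hi in zip(BIN_EDGES, BIN_EDGES[1:])]

# Toy example: sales of five hypothetical products in two successive weeks.
week_1 = normalize_by_mean([120.0, 40.0, 15.0, 300.0, 25.0])
week_2 = normalize_by_mean([100.0, 55.0, 10.0, 280.0, 55.0])
growth = log_growth_by_bin(week_1, week_2)
print({name: len(values) for name, values in growth.items()})
```

Comparing the distributions of the grouped growth rates across bins is then a direct check of whether growth depends on size.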

We saw in Section 6 that the tail of the sales distribution follows a power law
from January 22 2005 to May 8 2005 and a lognormal from January 16 2006 to August 8 2006
(Fig. 8). Hence, as shown in Fig. 13, we compare the distribution of the one-week log sales
growth (the logarithm of the ratio of sales one week apart) observed during the periods
January 22 2005 to May 8 2005 and January 16 2006 to August 8 2006. Note that while the
positive values of the middle and high areas are
quite different during January 16 2006 to August 8 2006, they seem to coincide for the log
growth distributions observed during January 22 2005 to May 8 2005. Note also that the log
growth distribution is well described by a double exponential distribution and that
for the negative log growth rates, the probability densities coincide. This suggests that while
the multiplicative term for the high and middle areas is size independent during January
22 2005 to May 8 2005, it is size dependent during January 16 2006 to August 8 2006, so the
two periods show different behaviors.

The same observation can also be made using the two-sample Kolmogorov-Smirnov test.
Table 1 shows the p-values from the test for two pairs, "high vs middle" and "middle vs
low", for the two periods, January 22 2005 to May 8 2005 and January 16 2006 to August 8 2006,
respectively. The only pair for which the test does not reject the null hypothesis is the "high
vs middle" pair in January 22 2005 to May 8 2005.
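
For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov test written with the Python standard library only. The p-value uses the asymptotic Kolmogorov series with a commonly used small-sample correction; this is an approximation chosen for the sketch and is not necessarily the computation behind Table 1.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Maximum distance between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_p_value(d, n, m, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and its value is essentially 1
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return max(0.0, min(1.0, 2.0 * total))


# Identical samples give distance 0; fully separated samples give distance 1.
d_same = ks_statistic([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
p_same = ks_p_value(d_same, 3, 3)
d_apart = ks_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
p_apart = ks_p_value(d_apart, 3, 3)
print(d_same, p_same, d_apart, round(p_apart, 3))
```

Applied to the log growth rates of two size bins, a small p-value indicates that the two growth distributions differ, that is, that growth is size dependent between those bins.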

From these observations, the sales dynamics can be described by the multiplicative process:



where $t$ describes the time during January 22 2005 to May 8 2005 and

$$S(t+1) = |b(S(t))\,S(t) + \epsilon(t)|, \qquad b(S(t)) = \begin{cases} b_{low}(S(t)) & \text{if } S(t) < 2.175 \\ b_{mid}(S(t)) & \text{if } 2.175 < S(t) < 6.76 \\ b_{high}(S(t)) & \text{if } 6.76 < S(t) \end{cases} \qquad (7)$$

where $t$ describes the time during January 16 2006 to August 8 2006. Here, $\epsilon(t) \sim \mathrm{Gaussian}(0, \sigma)$.
Where $S_i(t)$ is large, the log growth distribution is independent of $S_i(t)$ during
January 22 2005 to May 8 2005, but it depends on $S_i(t)$ during January 16 2006 to August 8
2006. As shown by Ishikawa et al. [16], if the log growth distribution is well described by a
double exponential distribution and the probability densities coincide for negative log growth
rates but exhibit a size-dependent relationship for positive values, then the multiplicative
process described in Eq. (7) will generate a steady lognormal distribution. Recall that this
condition is satisfied during the period January 16 2006 to August 8 2006. On the other hand,
as shown by Takayasu et al. [27], Eq. (5) theoretically generates a power-law distribution when
the growth distribution is independent of $S_i(t)$. Therefore, while Eq. (6) generates a
distribution with a power-law tail, Eq. (7) generates a simple lognormal distribution, which
explains the difference in the underlying dynamics for the two periods in which we observed
different distributions.

VIII. CONCLUSION AND FURTHER DISCUSSIONS



This paper showed how the sales distributions of products evolve when observed
daily. We showed that the distribution of the top ranking products switches between
lognormal and power-law distributions depending on the timing, suggesting that the underlying
dynamics differ among these periods. This structural difference in the underlying
dynamics was also established from the usually assumed ingredients of the growth process,
providing another source of evidence that the dynamics of these two periods
differ. Our result is mathematically compatible with Ishikawa et al. [16], who illustrated the
appearance of both power-law and lognormal distributions under a multiplicative process.
We only investigated digital cameras in this paper; however, such findings can be established
for many other products in consumer electronics markets as well.

An interesting question to ponder is why the switching behavior found in Section 6 occurred.
In our case the main source of the switch can probably be explained by product turnover.
In product markets such as the digital camera market, the product life cycle is short: it takes
only about 6 to 12 months for a particular brand to change from an old product to a new one
due to product competition. Such product turnover usually takes place in February and
August, before the aggregate demand for digital cameras starts to rise. Fig. 14 shows the
time evolution of the number of products and the lower bound. As shown in Fig. 14, the
periods in which we observed steady power-law behavior coincide with a rapid increase in the
number of products (i.e. when rapid product turnover takes place), explaining the switch
from a lognormal to a power law. When a large number of new products are
born simultaneously, sales distributions are generated by a mixture of old and new products,
so that the sales dynamics are more accurately described by Gibrat's law (i.e. size-independent
growth rate). An empirical study that takes this product turnover effect into account is left
for future work.

As researchers become equipped with more detailed data from actual markets, we can
investigate actual market coordination in more detail. These studies are important
not only for the economics literature but also for physics, because social systems such as the
market are a natural laboratory for investigating coordination in complex systems.
We hope this line of study continues to be fruitful for both physics and economics.

FIG. 1: Cumulative distribution of sales volume of digital cameras sold on April 1 2005. Dashed
line shows fitted maximum likelihood estimate assuming all points above 1 obeyed a lognormal
distribution, and continuous line shows fitted maximum likelihood estimate assuming all points
above 16 obeyed a power-law distribution. Parameters of both distributions are depicted as well.

FIG. 2: Log-size distribution of sales distribution found on April 1 2005 for values over $A = \log(16)$.
Maximum likelihood estimates of both exponential and singly truncated normal distributions are
depicted along with their parameters.

FIG. 3: Right panel depicts the p-value of the test of the null hypothesis that the distribution's
upper tail is a power law against the singly truncated lognormal alternative, as a function of
rank threshold.

FIG. 4: Two examples of randomly sampling from a lognormal distribution with parameters $\mu = 1.97$,
$\sigma = 1.36$. There are 250 points in both distributions. Note the power-law behavior at the
distribution's tail denoted by circles. Estimated lower bound for crosses is 9 and 77 for circles.

FIG. 5: Left panel shows estimated rank threshold from distribution of sales volume from January
22 2005 to May 9 2005. Right panel shows estimated rank threshold for first 107 synthetic data
sets (LN(1), LN(2), ..., LN(107)) obtained in experiment. Estimated rank threshold of 7th and
22nd data sets denoted as A and B are used to depict Fig. 4.

FIG. 6: Histogram comparison of estimated rank threshold. Circles denote histogram from synthetic
data sets, and diamonds denote histogram from real data.

FIG. 7: Time evolution of power-law exponent found during period January 22 2005 to May 8
2005.

FIG. 8: Time evolution of estimated rank threshold for entire period (October 1 2004 to February
29 2008).

FIG. 9: Cumulative distribution of sales volume of digital cameras sold on March 27 2006. Continuous
line shows fitted maximum likelihood estimate assuming that all points above 1 obeyed a
lognormal distribution. Maximum likelihood estimate of parameters is written inside parentheses.

FIG. 10: Histogram of estimated rank threshold from real data (January 16 2006 to August 22
2006).

FIG. 11: Average daily market share of brands versus their age in semilog scale. Continuous line
shows the least squares fit for the exponential hypothesis.



24 




0.1 L 10 0.1 L 10 

S(i,t) S(i,t) 



FIG. 12: Cutting scatter plots into equal logarithmic bins. Left panel depicts sales of successive
weeks from period January 22 2005 to May 8 2005, and right panel depicts sales of successive weeks
from period January 16 2006 to August 8 2006.

FIG. 13: Left panel shows distribution of log growth for one week for period January 22 2005 to
May 8 2005. Right panel shows distribution for January 16 2006 to August 8 2006. Although the right
panel clearly appears size dependent between the middle and high areas, the distributions in the
left panel seem to coincide.

FIG. 14: Estimated rank threshold for all dates with time evolution of number of products on
market.

KS Test        Jan 22 - May 8 2005    Jan 16 - Aug 8 2006
Low vs Mid     2.10E-13               2.58E-14
Mid vs High    0.324                  1.00E-05

TABLE I: Result of the two-sample Kolmogorov-Smirnov test. Entries are p-values. Rows
represent pairs and columns denote periods. The null hypothesis states that the two distributions
are identical, and the alternative states that they are different.